Confidence for Speaker Diarization using PCA Spectral Ratio
Abstract
Confidence scoring is an important component of speaker diarization systems, both for offline speech analytics and for online diarization systems that must produce a speaker segmentation from very little audio. This paper proposes a confidence measure for speaker diarization based on the spectral ratio of the eigenvalues of the Principal Component Analysis (PCA) transformation, computed on the pre-segmented audio before diarization is performed on the conversation. We tested our method on two-speaker data, and our results show the effectiveness of the PCA spectral ratio confidence measure for both offline and online diarization. We compare and contrast the proposed measure with other clustering validation methods, which also provide a quantitative measure of segmentation quality but are calculated on the segmented data after diarization is performed, and with a related approach that extracts a confidence from the PCA of the pre-segmented audio.
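The abstract does not spell out how the spectral ratio is computed, so the following is only a minimal sketch under stated assumptions: each pre-segmented chunk of audio is summarized by one fixed-length feature vector, PCA is taken over those vectors, and the ratio is defined as the largest eigenvalue divided by the second largest. The function name `pca_spectral_ratio`, the synthetic data, and the reading that a larger ratio suggests a clearer two-speaker separation are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def pca_spectral_ratio(segment_vectors):
    """Ratio of the largest to the second-largest PCA eigenvalue.

    Assumption: `segment_vectors` has shape (n_segments, n_features),
    one row per pre-segmented chunk of audio.
    """
    x = np.asarray(segment_vectors, dtype=float)
    if x.ndim != 2 or x.shape[0] < 3 or x.shape[1] < 2:
        raise ValueError("need at least 3 segments with at least 2 features")

    # Center the data; PCA eigenvalues are the eigenvalues of the covariance.
    centered = x - x.mean(axis=0)

    # Singular values come back in descending order; squaring and scaling
    # them gives the covariance eigenvalues without forming the matrix.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    eigenvalues = singular_values ** 2 / (x.shape[0] - 1)

    # Guard against division by zero for degenerate (rank-1) input.
    eps = np.finfo(float).eps
    return float(eigenvalues[0] / max(eigenvalues[1], eps))


# Synthetic illustration (not real speech features).
rng = np.random.default_rng(0)
n_segments, n_features = 200, 5

# "Two speakers": two clusters separated along the first feature axis.
offsets = np.zeros((n_segments, n_features))
offsets[: n_segments // 2, 0] = 3.0
offsets[n_segments // 2 :, 0] = -3.0
two_speakers = offsets + rng.normal(size=(n_segments, n_features))

# "One speaker": a single isotropic cluster.
one_speaker = rng.normal(size=(n_segments, n_features))

ratio_two = pca_spectral_ratio(two_speakers)
ratio_one = pca_spectral_ratio(one_speaker)
print(f"two-cluster ratio: {ratio_two:.2f}, one-cluster ratio: {ratio_one:.2f}")
```

In this toy setting the well-separated two-cluster data yields a clearly larger ratio than the single cluster, which is the intuition behind using such a ratio as a confidence score before any clustering is run. How the ratio maps to a calibrated confidence on real conversations would depend on the features and thresholds used in the paper.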
Similar resources
Two-Pass IB Based Speaker Diarization System Using Meeting-Specific ANN Based Features
In this paper, we present a two-pass Information Bottleneck (IB) based system for speaker diarization which uses meeting-specific artificial neural network (ANN) based features. We first use an IB based speaker diarization system to get the labelled speaker segments. These segments are re-segmented using Kullback-Leibler Hidden Markov Model (KL-HMM) based re-segmentation. The multi-layer ANN is the...
Speaker Diarization Based on GMM Supervectors and Unsupervised Intra-Speaker Variability Modeling
This paper presents a novel framework for speaker diarization. Audio is parameterized by a sequence of GMM-supervectors representing overlapping short segments of speech. Session-dependent intra-session intra-speaker variability is estimated online in an unsupervised manner, and is removed from the supervectors using Nuisance Attribute Projection (NAP). The supervectors are then projected using ...
Trainable speaker diarization
This paper presents a novel framework for speaker diarization. We explicitly model intra-speaker inter-segment variability using a speaker-labeled training corpus and use this modeling to assess the speaker similarity between speech segments. Modeling is done by embedding segments into a segment-space using kernel-PCA, followed by explicit modeling of speaker variability in the segment-space. O...
Exploiting Intra-Conversation Variability for Speaker Diarization
In this paper, we propose a new approach to speaker diarization based on the Total Variability approach to speaker verification. Drawing on previous work done in applying factor analysis priors to the diarization problem, we arrive at a simplified approach that exploits intra-conversation variability in the Total Variability space through the use of Principal Component Analysis (PCA). Using our...
Integration of TDOA features in information bottleneck framework for fast speaker diarization
In this paper we address the combination of multiple feature streams in a fast speaker diarization system for meeting recordings. Whenever Multiple Distant Microphones (MDM) are used, it is possible to estimate the Time Delay of Arrival (TDOA) for different channels. In [9], it is shown that TDOA can be used as additional features together with conventional spectral features for improving speak...